Major Feature Release - Antigravity Provider, Credential Prioritization & Enhanced OAuth #10

Mirrowel · 2025-11-27T09:16:47Z

📋 Summary

This PR introduces significant enhancements to the LLM API Key Proxy, including a new Antigravity provider with Gemini 3 support, intelligent credential prioritization, configurable rotation strategies, and a refactored OAuth architecture. These changes improve security and reliability and expand model support while maintaining backward compatibility.

🎯 Major Features

1. 🚀 Antigravity Provider (New)

The most sophisticated provider implementation to date. It targets Google's internal Antigravity API and fully supports the models listed below:

Supported Models:
- Gemini 2.5 (Pro/Flash) with thinkingBudget parameter
- Gemini 3 (Pro/Image-preview) - First implementation with full support
- Claude Sonnet 4.5 via Antigravity proxy
Advanced Features:
- Thought Signature Caching: Server-side caching of encrypted signatures for multi-turn Gemini 3 conversations
- Tool Hallucination Prevention: Automatic system instruction and parameter signature injection to prevent incorrect tool parameters
- Thinking Preservation: Caches Claude thinking content for consistency across conversation turns
- Automatic Base URL Fallback: Resilient endpoint switching (sandbox → production)
- Schema Cleaning: Handles Claude-specific tool schema requirements
- Unified Streaming/Non-streaming: Single code path for optimal performance
Configuration: Full OAuth 2.0 support with stateless deployment capability
File Logging: Optional transaction logging for debugging

Files Added:

src/rotator_library/providers/antigravity_provider.py (1,616 lines)
src/rotator_library/providers/antigravity_auth_base.py

2. 🎯 Credential Prioritization System

Intelligent credential tier detection and priority-based selection ensures optimal credential usage:

Provider-Level Priorities: Providers implement get_credential_priority() to return priority levels (1=highest, 10=lowest)
Model-Level Requirements: Providers implement get_model_tier_requirement() to specify minimum priority for models
Automatic Filtering: Client automatically filters incompatible credentials before requests
Priority-Aware Selection: UsageManager prioritizes higher-tier credentials within priority groups

Example Implementation (Gemini CLI):

Paid-tier credentials: Priority 1 (highest)
Free-tier credentials: Priority 2
Unknown tier: Priority 10 (lowest)
Gemini 3 models: Require priority 1 (paid tier only)
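
The filtering step can be sketched roughly as follows. This is a minimal illustration, not the code in client.py: the function name `filter_credentials_for_model`, the constant `UNKNOWN_TIER_PRIORITY`, and the credential file names are hypothetical, and it assumes that "minimum priority" means a credential qualifies when its priority number is less than or equal to the model's requirement.

```python
from typing import Dict, List, Optional

UNKNOWN_TIER_PRIORITY = 10  # lowest priority, used when the tier is unknown


def filter_credentials_for_model(
    credentials: List[str],
    priorities: Dict[str, int],
    required_priority: Optional[int],
) -> List[str]:
    """Keep credentials that satisfy a model's tier requirement, best tier first."""
    if required_priority is None:
        # The model has no tier requirement, so every credential is eligible.
        eligible = list(credentials)
    else:
        # Assumption: lower number = higher tier, so a credential qualifies
        # when its priority number does not exceed the requirement.
        eligible = [
            c for c in credentials
            if priorities.get(c, UNKNOWN_TIER_PRIORITY) <= required_priority
        ]
    return sorted(eligible, key=lambda c: priorities.get(c, UNKNOWN_TIER_PRIORITY))


credentials = ["free.json", "paid.json", "unknown.json"]
priorities = {"paid.json": 1, "free.json": 2}  # unknown.json falls back to 10

gemini3_pool = filter_credentials_for_model(credentials, priorities, 1)
open_pool = filter_credentials_for_model(credentials, priorities, None)
```

Under these assumptions a model that requires priority 1 only ever sees the paid-tier credential, while a model without a requirement sees all credentials ordered from highest to lowest tier.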

Benefits:

Ensures paid-tier credentials are used for premium models
Prevents failed requests due to tier restrictions
Optimal cost distribution
Graceful fallback if primary credentials unavailable

3. 🎲 Weighted Random Rotation

Configurable credential rotation strategy for enhanced security and unpredictability:

rotation_tolerance Parameter (default: 3.0):
- 0.0: Deterministic - always selects least-used credential (perfect balance)
- 2.0-4.0 (recommended): Weighted random with bias toward less-used credentials
- 5.0+: High randomness for maximum unpredictability
Formula: weight = (max_usage - credential_usage) + tolerance + 1
Security Benefits:
- Unpredictable selection patterns make rate limit detection harder
- Prevents fingerprinting while maintaining load balance
- Recommended for production with multiple credentials

Configuration:

client = RotatingClient(
    rotation_tolerance=3.0  # Weighted random (recommended)
)
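
To make the formula concrete, here is a small sketch of weighted selection. The helper names `selection_weights` and `select_credential` are illustrative and are not the `_select_weighted_random()` method itself. Treating a tolerance of 0.0 as a special deterministic case is an assumption made to match the description above, since the formula alone would still give every credential a non-zero weight.

```python
import random
from typing import Dict, Optional


def selection_weights(usage: Dict[str, int], tolerance: float) -> Dict[str, float]:
    """weight = (max_usage - credential_usage) + tolerance + 1"""
    max_usage = max(usage.values())
    return {name: (max_usage - count) + tolerance + 1 for name, count in usage.items()}


def select_credential(
    usage: Dict[str, int],
    tolerance: float,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick a credential, biased toward the less-used ones."""
    if tolerance <= 0:
        # Assumed special case: always return the least-used credential.
        return min(usage, key=lambda name: usage[name])
    weights = selection_weights(usage, tolerance)
    rng = rng or random.Random()
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


usage = {"cred_a": 0, "cred_b": 5, "cred_c": 10}
weights = selection_weights(usage, tolerance=3.0)
# cred_a: (10 - 0) + 3 + 1 = 14, cred_b: 9, cred_c: 4
```

A larger tolerance adds the same constant to every weight, which flattens the distribution and makes selection closer to uniform.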

4. 🔧 Enhanced Gemini CLI Provider

Significant improvements to Gemini CLI authentication and model support:

Improved Project Discovery:
- Enhanced onboarding flow for new users
- Better handling of free vs paid tier projects
- Automatic tier detection and caching
- Support for GEMINI_CLI_PROJECT_ID override
Gemini 3 Support:
- Full support for Gemini 3 models with thinkingLevel configuration
- Tool hallucination prevention via system instruction injection
- ThoughtSignature caching for multi-turn conversations
- Parameter signature injection into tool descriptions
Credential Prioritization: Automatic paid vs free tier detection and priority assignment

5. 🗄️ Provider Cache System (New)

Modular, shared caching system for provider conversation state:

Architecture:
- Dual-TTL design: short-lived memory cache + longer-lived disk persistence
- Background persistence with batched writes (60s interval)
- Automatic cleanup of expired entries (30min interval)
- Atomic disk writes to prevent corruption
Key Methods:
- store() / store_async(): Synchronous/async storage
- retrieve() / retrieve_async(): With disk fallback
- Statistics tracking (hits, misses, writes)
Use Cases:
- Gemini 3 ThoughtSignatures
- Claude thinking content
- Any provider-specific transient state
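
The dual-TTL idea can be illustrated with a deliberately simplified, synchronous sketch. The class name `DualTTLCacheSketch` and its fields are hypothetical. Unlike the cache described above, it writes to disk on every store instead of batching writes in a background task, and it takes an explicit `now` argument so that expiry is easy to follow.

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Any, Dict, Optional


class DualTTLCacheSketch:
    def __init__(self, cache_file: Path, memory_ttl: float = 3600, disk_ttl: float = 86400):
        self.cache_file = cache_file
        self.memory_ttl = memory_ttl
        self.disk_ttl = disk_ttl
        self._memory: Dict[str, tuple] = {}  # key -> (value, stored_at)
        self.stats = {"memory_hits": 0, "disk_hits": 0, "misses": 0, "writes": 0}

    def _read_disk(self) -> Dict[str, Any]:
        try:
            return json.loads(self.cache_file.read_text())
        except (FileNotFoundError, json.JSONDecodeError):
            return {}

    def store(self, key: str, value: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._memory[key] = (value, now)
        data = self._read_disk()
        data[key] = {"value": value, "ts": now}
        # Atomic write: dump to a temp file in the same directory, then rename.
        fd, tmp_path = tempfile.mkstemp(dir=self.cache_file.parent, suffix=".tmp")
        with os.fdopen(fd, "w") as handle:
            json.dump(data, handle)
        os.replace(tmp_path, self.cache_file)
        self.stats["writes"] += 1

    def retrieve(self, key: str, now: Optional[float] = None) -> Optional[str]:
        now = time.time() if now is None else now
        entry = self._memory.get(key)
        if entry is not None and now - entry[1] < self.memory_ttl:
            self.stats["memory_hits"] += 1
            return entry[0]
        self._memory.pop(key, None)  # expired or absent in memory
        disk_entry = self._read_disk().get(key)
        if disk_entry is not None and now - disk_entry["ts"] < self.disk_ttl:
            self.stats["disk_hits"] += 1
            self._memory[key] = (disk_entry["value"], now)  # reload into memory
            return disk_entry["value"]
        self.stats["misses"] += 1
        return None
```

Writing to a temporary file and renaming it means a crash mid-write leaves the previous cache file intact, which is the property the "atomic disk writes" bullet refers to.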

Files Added:

src/rotator_library/providers/provider_cache.py (498 lines)

6. 🔐 Refactored OAuth Architecture

Shared OAuth base class eliminates code duplication:

GoogleOAuthBase Class: Single source of truth for all OAuth logic
Benefits:
- Easy provider addition (override constants only)
- Consistent behavior across providers
- OAuth bug fixes are made once and apply to every provider
- Maintainability improved
Inherited Features:
- Automatic token refresh with exponential backoff
- Invalid grant re-authentication flow
- Stateless deployment support
- Atomic credential file writes
- Headless environment detection

Refactored Providers:

GeminiAuthBase → extends GoogleOAuthBase
AntigravityAuthBase → extends GoogleOAuthBase

Files Added:

src/rotator_library/providers/google_oauth_base.py (653 lines)

7. 🌡️ Temperature Override

Global temperature=0 override to prevent tool hallucination:

Modes:
- "remove": Deletes temperature=0 from requests
- "set": Changes temperature=0 to temperature=1.0
- "false": Disabled (default)
Configuration:

OVERRIDE_TEMPERATURE_ZERO=remove  # or "set"
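
A rough sketch of how the modes could be applied to a request body. The function name `apply_temperature_override` is hypothetical and the actual logic in main.py may differ. The accepted aliases for "set" ("true", "1", "yes") are taken from the commit message for this feature.

```python
from typing import Any, Dict


def apply_temperature_override(body: Dict[str, Any], mode: str) -> Dict[str, Any]:
    """Return a copy of the request body with temperature=0 overridden per mode."""
    mode = mode.strip().lower()
    if body.get("temperature") != 0:
        return body  # only requests with temperature exactly 0 are touched
    result = dict(body)
    if mode == "remove":
        del result["temperature"]  # let the upstream default apply
    elif mode in {"set", "true", "1", "yes"}:
        result["temperature"] = 1.0
    return result  # any other mode (e.g. "false") leaves the value as-is


request = {"model": "example-model", "temperature": 0}
removed = apply_temperature_override(request, "remove")
replaced = apply_temperature_override(request, "set")
untouched = apply_temperature_override(request, "false")
```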

8. 🛠️ Tool Improvements

Enhanced Credential Tool:

Added Antigravity OAuth credential support
Reorganized export menu (consolidated into submenu)
Export Antigravity credentials to .env format

Updated Launcher:

Support for new Antigravity provider
Updated credential discovery paths

📝 Documentation Updates

Major Documentation Changes:

DOCUMENTATION.md:
- Section 2.10: Credential Prioritization System
- Section 2.11: Provider Cache System
- Section 2.12: Google OAuth Base
- Section 3.5: Antigravity Provider (comprehensive)
- Updated Section 2.2: Weighted random rotation strategy
README.md:
- Added Antigravity provider documentation
- Updated credential tool instructions
- Added rotation_tolerance configuration
- Added temperature override documentation
- Added credential prioritization examples
Deployment guide.md:
- Added Antigravity OAuth setup instructions
- Stateless deployment guide for Antigravity
src/rotator_library/README.md:
- Updated library features list
- Added Antigravity provider details
- Credential prioritization documentation

🔧 Technical Changes

Client (`client.py`)

Added rotation_tolerance parameter to constructor
Integrated credential priority filtering in _make_completion_request and acompletion_stream
Build priority maps from provider plugins
Filter credentials by model tier requirements
Enhanced logging for priority-aware selection

Usage Manager (`usage_manager.py`)

Implemented weighted random credential selection
Added _select_weighted_random() method
Enhanced acquire_key() with priority group support
Credential selection statistics logging
Support for credential_priorities parameter

Provider Interface (`provider_interface.py`)

Added get_credential_priority() method
Added get_model_tier_requirement() method
Optional implementations for credential prioritization

Credential Manager (`credential_manager.py`)

Added Antigravity to DEFAULT_OAUTH_DIRS

Factory (`provider_factory.py`)

Registered AntigravityAuthBase provider

Proxy App (`main.py`)

Global temperature=0 override implementation
Enhanced reasoning parameter logging
Fixed role handling in streaming responses

🐛 Bug Fixes

Streaming Response Role Handling: Fixed role field concatenation bug (should always replace, never concatenate)
Gemini CLI Project Discovery: Enhanced error handling and tier detection logic
OAuth Credential Persistence: Improved atomic write operations
File Logging: Enhanced error handling and output formatting

📦 Dependencies & Configuration

New Configuration Options:

Antigravity Provider:

# Stateless deployment
ANTIGRAVITY_ACCESS_TOKEN="..."
ANTIGRAVITY_REFRESH_TOKEN="..."
ANTIGRAVITY_EXPIRY_DATE="..."
ANTIGRAVITY_EMAIL="..."

# Feature toggles
ANTIGRAVITY_ENABLE_SIGNATURE_CACHE=true
ANTIGRAVITY_GEMINI3_TOOL_FIX=true

Rotation Strategy:

ROTATION_TOLERANCE=3.0  # Weighted random (recommended)

Temperature Override:

OVERRIDE_TEMPERATURE_ZERO=remove  # or "set"

Updated `.gitignore`:

Removed *.log exclusion (to allow log directories)
Added launcher_config.json
Added cache/ directory
Added specific cache file exclusions

🚀 Migration Guide

For Existing Users:

No Breaking Changes: All existing functionality remains backward compatible
Optional New Features:
- Set rotation_tolerance for weighted random rotation (recommended: 3.0)
- Enable temperature override if experiencing tool hallucination
- Add Antigravity credentials for Gemini 3 access
Gemini CLI Users:
- Automatic tier detection will now occur
- Gemini 3 models require paid-tier credentials
- Set GEMINI_CLI_PROJECT_ID if using paid tier
New Environment Variables (all optional):
- ROTATION_TOLERANCE (default: 0.0)
- OVERRIDE_TEMPERATURE_ZERO (default: false)
- Antigravity-specific variables (if using provider)

📊 Statistics

Files Changed: 24
Lines Added: ~7,500+
Lines Removed: ~650
New Providers: 1 (Antigravity)
New Features: 7 major features
Documentation Updates: 4 files

Important

This PR adds an Antigravity provider, credential prioritization, and enhanced OAuth architecture to the LLM API Key Proxy, supporting Gemini 3 models and implementing a weighted random rotation strategy.

Antigravity Provider:
- New provider antigravity_provider.py with Gemini 3 support and advanced features like thought signature caching and tool hallucination prevention.
- Supports Gemini 2.5, Gemini 3, and Claude Sonnet 4.5 models.
- Implements OAuth 2.0 with stateless deployment capability.
Credential Prioritization:
- Providers implement get_credential_priority() and get_model_tier_requirement() for intelligent credential selection.
- Ensures optimal usage of paid-tier credentials for premium models.
Weighted Random Rotation:
- Configurable strategy in usage_manager.py for enhanced security and unpredictability.
- rotation_tolerance parameter controls randomness.
Enhanced OAuth Architecture:
- Refactored to google_oauth_base.py for shared logic across providers.
- Simplifies provider addition and ensures consistent behavior.
Miscellaneous:
- Adds provider_cache.py for shared caching system.
- Updates client.py and main.py for new features and configurations.
- Documentation updates in DOCUMENTATION.md and README.md.


Add a new Antigravity provider and authentication base to integrate with the Antigravity (internal Google) API.
- Add providers/antigravity_auth_base.py: OAuth2 token management with env/file loading, atomic saves, refresh logic, backoff/queue tracking, interactive and headless browser auth flow, and helper utilities.
- Add providers/antigravity_provider.py: request/response transformations (OpenAI → Gemini CLI → Antigravity), model aliasing, thinking/reasoning config mapping, tool response grouping, streaming & non-streaming handling, and base-URL fallback.
- Update provider_factory.py and providers/__init__.py to register the new provider.
- Bump project metadata in pyproject.toml (package name and version).
BREAKING CHANGE: project packaging metadata updated: package name changed to "rotator_library" and version bumped to 0.95. Update any dependency or packaging references that relied on the previous name/version.

…ng_content separation - Introduce Gemini 3 special mechanics in AntigravityProvider: - append a constant thoughtSignature into functionCall payloads to preserve Gemini reasoning continuity - filter out thoughtSignature parts from returned content to avoid exposing encrypted reasoning data - separate parts flagged with thought=true into a new reasoning_content field while keeping regular content in content - include thoughtsTokenCount in token accounting: prompt_tokens now includes reasoning tokens and reasoning_tokens are reported under completion_tokens_details.reasoning_tokens when present - Update comments, docstrings, and conversion logic to reflect Gemini 3 behavior - Rotate Antigravity OAuth client secret in AntigravityAuthBase

…token counting Add a per-request file logger and reasoning configuration mapping to the Antigravity provider and expose a token counting helper. - Introduce _AntigravityFileLogger to persist request payloads, streaming chunks, errors, and final responses under logs/antigravity_logs with timestamped directories. - Add optional enable_request_logging kwarg to completion flow to enable per-call file logging; wire logger through streaming and non-streaming handlers. - Log request payloads, raw response chunks, parse errors, and final unwrapped responses when enabled. - Add _map_reasoning_effort_to_thinking_config to map reasoning_effort ('low'|'medium'|'high'|'disable'|None) to Gemini thinkingConfig for gemini-2.5 and gemini-3 families (budgets/levels and include_thoughts). - Add count_tokens method that calls Antigravity :countTokens endpoint using transformed Gemini payloads and returns prompt/total token counts. - Add cautionary comment about Claude parametersJsonSchema handling requiring investigation. No behavioral breaking changes; new logging is opt-in via enable_request_logging and token counting is additive.

…budget toggle Introduce a consolidated mapping for reasoning effort targeted at Gemini 2.5 and Gemini 3 models:
- Replace older duplicated logic with a single _map_reasoning_effort_to_thinking_config that detects gemini-2.5 vs gemini-3.
- Gemini 2.5: map reasoning_effort to model-specific thinkingBudget values (pro/flash/fallback). Default auto = -1. Apply division by 4 unless kwargs['custom_reasoning_budget'] is True.
- Gemini 3: use string thinkingLevel ("low" or "high"), default to "high" when unspecified and do not allow disabling thinking.
- Return None for non-Gemini models to avoid changing other providers (e.g., Claude).
- Propagate a new custom_reasoning_budget toggle from kwargs to the mapping call.
- Add threading and os imports and remove the old obsolete mapping implementation.
BREAKING CHANGE: Gemini 3 thinkingConfig format and defaults changed:
- thinkingLevel is now a string ("low"/"high") instead of numeric levels. Update any code that inspects thinkingConfig thinkingLevel.
- Default thinking behavior for Gemini 3 is now "high" when reasoning_effort is omitted.
- The mapping function signature/behavior changed (added custom_reasoning_budget handling). If this method was called externally, update callers to pass the new parameter or rely on kwargs propagation.
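
As a loose illustration of the mapping described in this commit, the sketch below follows the stated rules for each model family. Several details are assumptions rather than facts from the source: the budget numbers in `EXAMPLE_BUDGETS` are placeholders (the real values are model-specific and not given here), mapping "disable" to a budget of 0 is a guess, and treating every effort other than "low" as "high" for Gemini 3 is one possible reading. The function name is also illustrative.

```python
from typing import Any, Dict, Optional

# Placeholder budgets for illustration only; not the provider's real values.
EXAMPLE_BUDGETS = {"low": 4096, "medium": 8192, "high": 16384}


def map_reasoning_effort(
    model: str,
    reasoning_effort: Optional[str],
    custom_reasoning_budget: bool = False,
) -> Optional[Dict[str, Any]]:
    """Translate an OpenAI-style reasoning_effort into a thinking config dict."""
    if "gemini-3" in model:
        # Gemini 3 takes a string level, defaults to "high", and cannot be disabled.
        level = "low" if reasoning_effort == "low" else "high"
        return {"thinkingLevel": level}
    if "gemini-2.5" in model:
        if reasoning_effort is None:
            return {"thinkingBudget": -1}  # auto
        if reasoning_effort == "disable":
            return {"thinkingBudget": 0}  # assumption: 0 turns thinking off
        budget = EXAMPLE_BUDGETS.get(reasoning_effort, -1)
        if budget > 0 and not custom_reasoning_budget:
            budget //= 4  # divisor stated in this commit
        return {"thinkingBudget": budget}
    return None  # other model families are left untouched


gemini3_default = map_reasoning_effort("gemini-3-pro-preview", None)
gemini25_high = map_reasoning_effort("gemini-2.5-pro", "high")
gemini25_custom = map_reasoning_effort("gemini-2.5-pro", "high", custom_reasoning_budget=True)
```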

…e thoughtSignature handling for Gemini 3
- Introduce ThoughtSignatureCache: TTL-based, thread-safe, auto-cleanup cache for mapping tool_call_id → thoughtSignature.
- Integrate cache into AntigravityProvider and add env toggles:
  - ANTIGRAVITY_SIGNATURE_CACHE_TTL (default 3600s)
  - ANTIGRAVITY_PRESERVE_THOUGHT_SIGNATURES (client passthrough)
  - ANTIGRAVITY_ENABLE_SIGNATURE_CACHE (server-side caching)
- Update message transformation to accept model and implement a 3-tier thoughtSignature fallback:
  1. client-provided signature
  2. server-side cache
  3. bypass constant ("skip_thought_signature_validator") with warning for Gemini 3
- Fix Gemini → OpenAI chunk conversion:
  - Stop dropping function calls that include signatures (skip only standalone signature parts).
  - Store signatures into server cache and optionally include them in responses when passthrough is enabled.
  - Robustly parse tool responses, map finish reasons, and include reasoning token counts in usage.
- Improve tool response grouping and id generation; add informative logging for signature-preservation behavior
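
The 3-tier fallback can be sketched as a small lookup function. The name `resolve_thought_signature` and the plain-dict cache are illustrative stand-ins for the provider's own method and cache object. Only the bypass constant value is taken from the commit message.

```python
import logging
from typing import Dict, Optional

logger = logging.getLogger("antigravity_sketch")

BYPASS_SIGNATURE = "skip_thought_signature_validator"


def resolve_thought_signature(
    tool_call_id: str,
    client_signature: Optional[str],
    cache: Dict[str, str],
    cache_enabled: bool = True,
) -> str:
    """Pick a thoughtSignature: client value, then cache, then bypass constant."""
    if client_signature:
        return client_signature  # tier 1: the client sent the signature back
    if cache_enabled:
        cached = cache.get(tool_call_id)
        if cached:
            return cached  # tier 2: server-side cache keyed by tool_call_id
    # Tier 3: nothing known, so fall back to the bypass constant and warn.
    logger.warning("No thoughtSignature for tool call %s; using bypass", tool_call_id)
    return BYPASS_SIGNATURE


cache = {"call_1": "cached-signature"}
from_client = resolve_thought_signature("call_1", "client-signature", cache)
from_cache = resolve_thought_signature("call_1", None, cache)
from_bypass = resolve_thought_signature("call_2", None, cache)
```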

…tSignature and decouple cache/passthrough Enforce Gemini 3 behavior where only the first tool call in parallel receives a thoughtSignature. Previously caching and client passthrough were coupled and could result in multiple signatures being stored or passed. This change: - add a first_signature_seen flag to ensure only the first tool call gets the signature - store signature in server-side cache only when _enable_signature_cache is true - pass signature to the client only when _preserve_signatures_in_client is true - preserve logging when a signature is stored in cache

…y aliasing Add "claude-sonnet-4-5" and "claude-sonnet-4-5-thinking" to HARDCODED_MODELS and simplify the alias mappings by removing explicit alias entries for these Claude models since their public names match internal names. This ensures the provider recognizes the new Claude Sonnet variants and avoids incorrect alias translations.

- Add providers/google_oauth_base.py to centralize Google OAuth logic (auth flow, token refresh, env loading, atomic saves, backoff/retry, queueing, headless support, and validation). - Migrate GeminiAuthBase and AntigravityAuthBase to inherit from GoogleOAuthBase and expose provider-specific constants (CLIENT_ID, CLIENT_SECRET, OAUTH_SCOPES, ENV_PREFIX, CALLBACK_PORT, CALLBACK_PATH). - Register "antigravity" in DEFAULT_OAUTH_DIRS and mark it as OAuth-only in credential_tool; include a user-friendly display name for interactive flows. - Remove large duplicated OAuth implementations from provider-specific files and consolidate behavior to reduce maintenance surface and ensure consistent token handling.
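
The "override constants only" pattern might look roughly like the sketch below. The base class here is a tiny stand-in, not the real GoogleOAuthBase. The two helper methods are hypothetical, and every constant value (client ID, secret, scope, port, path) is a placeholder chosen for illustration.

```python
from typing import List


class GoogleOAuthBaseSketch:
    """Stand-in base class: shared logic reads provider-specific constants."""

    CLIENT_ID: str = ""
    CLIENT_SECRET: str = ""
    OAUTH_SCOPES: List[str] = []
    ENV_PREFIX: str = ""
    CALLBACK_PORT: int = 0
    CALLBACK_PATH: str = "/"

    def env_var(self, suffix: str) -> str:
        # Hypothetical helper: build names such as ANTIGRAVITY_ACCESS_TOKEN.
        return f"{self.ENV_PREFIX}_{suffix}"

    def redirect_uri(self) -> str:
        # Hypothetical helper: local callback URL for the browser auth flow.
        return f"http://localhost:{self.CALLBACK_PORT}{self.CALLBACK_PATH}"


class ExampleAntigravityAuth(GoogleOAuthBaseSketch):
    # A provider only overrides constants; all values below are placeholders.
    CLIENT_ID = "example-client-id"
    CLIENT_SECRET = "example-client-secret"
    OAUTH_SCOPES = ["https://www.googleapis.com/auth/cloud-platform"]
    ENV_PREFIX = "ANTIGRAVITY"
    CALLBACK_PORT = 8085
    CALLBACK_PATH = "/oauth2callback"


auth = ExampleAntigravityAuth()
access_token_var = auth.env_var("ACCESS_TOKEN")
callback_url = auth.redirect_uri()
```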

…_token helper Add opt-in dynamic model discovery controlled by ANTIGRAVITY_ENABLE_DYNAMIC_MODELS (default: false) to avoid relying on an unstable endpoint. When disabled, the provider returns the hardcoded model list; when enabled, it attempts to fetch models from the API and applies alias mappings. Add clear logging for enabled/disabled states and dynamic discovery results. Also introduce an async get_valid_token helper that loads credentials, refreshes expired tokens, and returns a valid access token for OAuth-style credential paths.
- New env var: ANTIGRAVITY_ENABLE_DYNAMIC_MODELS (false by default)
- Dynamic discovery returns discovered models prefixed with "antigravity/"
- Hardcoded fallback now returns names prefixed with "antigravity/"
- Added logs to indicate discovery mode and failures
- Added async get_valid_token(credential_identifier) to centralize token refresh/load
BREAKING CHANGE: Model names returned by the provider are now namespaced with the "antigravity/" prefix (e.g., "antigravity/xyz"). Update consumers to handle the new prefixed names or strip the prefix as needed. Dynamic discovery is disabled by default; enable it with ANTIGRAVITY_ENABLE_DYNAMIC_MODELS=true if desired.

…edential save - Handle system prompt content as either string or list and strip Claude-specific cache_control fields to avoid 400 errors - Safely parse tool content (JSON or raw) and wrap function responses consistently - Treat merged function response role as "user" to match Antigravity expectations - Add tool_call index for OpenAI streaming format and track index for parallel tool calls - Strip provider prefix from model names and add streaming query param (?alt=sse) when streaming - Include Host and User-Agent headers, set Accept based on streaming, and log error response bodies for easier debugging - Convert OpenAI-style chunks into litellm.ModelResponse objects before yielding in stream handler - Make credential persistence in Gemini CLI provider async (await _save_credentials)

…nd strip unsupported fields Remove dependency on _build_vertex_schema and align tool handling with the Go reference implementation. For function-type tools, build a function declaration with name, description, and a parametersJsonSchema field: - copy parameters when present and remove OpenAI-specific keys (`$schema`, `strict`); - default to an empty object schema when parameters are missing; - avoid mutating the original parameters and embed the declaration in `functionDeclarations`. This ensures Antigravity-compatible tool payloads and fixes schema/compatibility issues when passing tool definitions.
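
A minimal sketch of the declaration-building step described in this commit. The function name `build_function_declarations` is illustrative, and the exact shape of the "empty object schema" default (`{"type": "object", "properties": {}}`) is an assumption.

```python
import copy
from typing import Any, Dict, List

OPENAI_ONLY_KEYS = ("$schema", "strict")


def build_function_declarations(tools: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Convert OpenAI-style function tools into a functionDeclarations payload."""
    declarations = []
    for tool in tools:
        if tool.get("type") != "function":
            continue  # only function-type tools are handled in this sketch
        function = tool["function"]
        if function.get("parameters"):
            # Deep-copy so the caller's schema is never mutated.
            schema = copy.deepcopy(function["parameters"])
            for key in OPENAI_ONLY_KEYS:
                schema.pop(key, None)
        else:
            schema = {"type": "object", "properties": {}}  # assumed default shape
        declarations.append({
            "name": function["name"],
            "description": function.get("description", ""),
            "parametersJsonSchema": schema,
        })
    return {"functionDeclarations": declarations}


tools = [{
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a file",
        "parameters": {"$schema": "http://json-schema.org/draft-07/schema#",
                       "strict": True, "type": "object",
                       "properties": {"path": {"type": "string"}}},
    },
}, {"type": "function", "function": {"name": "ping"}}]
payload = build_function_declarations(tools)
```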

…mas, and fix Gemini tool conversion - Rename _normalize_json_schema → _normalize_type_arrays and convert JSON Schema "type" arrays (e.g. ["string","null"]) to a single non-null type to avoid protobuf "non-repeating" errors. - Add recursive Claude-specific schema cleaner and rename parametersJsonSchema → parameters for claude-sonnet-* models, stripping incompatible fields that break Claude validation. - Ensure thoughtSignature preservation logic remains with proper first-seen handling. - Inline generation of project/request IDs when fetching models. - Replace Vertex helper usage when building Gemini tool declarations: copy/clean parameters, set a safe default parametersJsonSchema, and call _normalize_type_arrays for compatibility.
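
The type-array normalization could be sketched as a recursive walk over the schema. This is an illustration rather than the provider's `_normalize_type_arrays`. It assumes the first non-null entry is kept and that an array containing only "null" collapses to "null", neither of which is specified in the commit message.

```python
from typing import Any


def normalize_type_arrays(schema: Any) -> Any:
    """Collapse JSON Schema "type" arrays such as ["string", "null"] to one type."""
    if isinstance(schema, dict):
        normalized = {}
        for key, value in schema.items():
            if key == "type" and isinstance(value, list):
                non_null = [t for t in value if t != "null"]
                # Assumption: keep the first non-null type; fall back to "null".
                normalized[key] = non_null[0] if non_null else "null"
            else:
                normalized[key] = normalize_type_arrays(value)
        return normalized
    if isinstance(schema, list):
        return [normalize_type_arrays(item) for item in schema]
    return schema


schema = {
    "type": "object",
    "properties": {
        "name": {"type": ["string", "null"]},
        "tags": {"type": "array", "items": {"type": ["null", "integer"]}},
    },
    "required": ["name"],
}
normalized = normalize_type_arrays(schema)
```

The function returns a new structure, so the original schema passed in by the caller is left unchanged.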

…ignature handling to gemini-3 Add "id" to functionCall and response objects required by Antigravity/Claude integrations. Restrict preservation/insertion of thoughtSignature to Gemini 3 models only: prefer client-provided signature, fall back to the server-side cache when enabled, and finally use the bypass constant "skip_thought_signature_validator". Emit a warning when a Gemini 3 tool call lacks a signature. Avoid adding thoughtSignature for Claude and other models to prevent sending unsupported fields.

Add an environment-controlled override that modifies requests with `temperature: 0` for chat completions when `OVERRIDE_TEMPERATURE_ZERO` is enabled (default: "false").
- Supported modes: "remove" deletes the `temperature` key; "set"/"true"/"1"/"yes" sets temperature to 1.0.
- Rationale: temperature=0 makes models overly deterministic and can cause tool hallucination; the override helps mitigate that when toggled.
- Emits debug logs when an override is applied.

…tem-instruction) to reduce tool hallucination Introduce a configurable "Gemini 3" catch-all fix that enforces schema-driven tool usage and reduces tool hallucination by: - adding env-configurable flag ANTIGRAVITY_GEMINI3_TOOL_FIX (default ON) and related vars for prefix, description prompt, and system instruction - implementing namespace prefixing for tool names to break model training associations - injecting strict parameter signatures into tool descriptions to force schema adherence - prepending configurable system instructions for Gemini-3 models to override training-data assumptions - normalizing request/response names (prefix/strip) and preserving function call ids for API consistency - applying transformations only for gemini-3-* models and logging configuration details This change improves robustness when calling external tools by making tool schemas explicit to the model.

Implement dual-TTL caching system with async disk persistence to improve thoughtSignature handling across server restarts and long-running sessions. - Add disk persistence using atomic file writes with tempfile pattern for data integrity - Implement dual-TTL system: 1-hour memory cache, 24-hour disk cache - Create background async tasks for periodic disk writes and memory cleanup - Add disk fallback mechanism for cache misses (loads from disk into memory) - Introduce cache statistics tracking (memory hits, disk hits, misses, writes) - Add graceful shutdown with pending write flush - Convert cache operations from threading.Lock to asyncio.Lock for async support - Add environment variables for configurable write/cleanup intervals - Implement secure file permissions (0o600) for cache files - Add comprehensive logging for cache lifecycle events The cache now survives server restarts and provides better support for multi-turn conversations by persisting thoughtSignatures to disk. Memory cache expires after 1 hour to prevent unbounded growth, while disk cache persists for 24 hours to support longer conversation sessions.

… in tool args - Extend reasoning/thinking mapping to include Claude alongside Gemini 2.5 and Gemini 3: - Claude now uses `thinkingBudget` (same handling as Gemini 2.5, including pro budgets). - Gemini 3 continues to use `thinkingLevel`. - Add a static helper `_recursively_parse_json_strings` to detect and parse JSON-stringified values returned by Antigravity (e.g., `{"files": "[{...}]"}`) and recursively restore proper structures. - Use parsed arguments before `json.dumps()` when building tool call payloads to prevent double-encoding and JSON parsing errors from Antigravity responses. - Update .gitignore to add `launcher_config.json` and `cache/antigravity/thought_signatures.json` and remove the previous `*.log` ignore entry.
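
The JSON-string unwrapping described here can be sketched as follows. This is an illustrative version, not the provider's `_recursively_parse_json_strings`. As a heuristic of this sketch, it only attempts to parse strings that start with `{` or `[`, so ordinary text and numeric strings are left alone.

```python
import json
from typing import Any


def recursively_parse_json_strings(value: Any) -> Any:
    """Restore nested structures that arrive as JSON-encoded strings."""
    if isinstance(value, dict):
        return {key: recursively_parse_json_strings(item) for key, item in value.items()}
    if isinstance(value, list):
        return [recursively_parse_json_strings(item) for item in value]
    if isinstance(value, str):
        stripped = value.strip()
        if stripped[:1] in ("{", "["):
            try:
                parsed = json.loads(stripped)
            except json.JSONDecodeError:
                return value  # looked like JSON but was not; keep the raw string
            return recursively_parse_json_strings(parsed)
    return value


raw_args = {"files": "[{\"path\": \"a.py\", \"lines\": \"[1, 2]\"}]", "note": "hello"}
clean_args = recursively_parse_json_strings(raw_args)
tool_call_arguments = json.dumps(clean_args)  # encoded exactly once
```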

…ravity cache handling
- Split the single signature cache into separate files: `GEMINI3_SIGNATURE_CACHE_FILE` and `CLAUDE_THINKING_CACHE_FILE`.
- Replace `ThoughtSignatureCache` with `AntigravityCache`; disk persistence file is now passed via a `cache_file` constructor argument and in-memory entries are keyed by generic cache keys.
- Introduce a stable key generator (`_generate_thinking_cache_key`) that combines tool call IDs and text hashes for Claude thinking caching.
- Add separate caches for Gemini 3 signatures (`_signature_cache`) and Claude thinking content (`_thinking_cache`), and wire caching into both streaming and non-streaming flows.
- Accumulate reasoning content, tool calls, and the final `thoughtSignature` during streaming (via `stream_accumulator`) and persist complete Claude thinking after the stream (`_cache_claude_thinking_after_stream`).
- Inject cached Claude "thinking" parts into assistant messages when available (with signature fallback handling).
- Use tool-provided IDs when present (fall back to generated `call_<uuid>` IDs), fix skipping logic for signature-only parts, and accumulate tool calls/text for reliable cache keys.
- Adjust reasoning budget division from `// 4` to `// 6` to reduce default thinking budget.
- Update `_gemini_to_openai_chunk` signature to accept an optional `stream_accumulator` and propagate accumulator through streaming logic.
BREAKING CHANGE: `ThoughtSignatureCache` has been removed/renamed to `AntigravityCache` and its constructor now requires a `cache_file: Path` argument. Update any external imports/usages:
- Replace `ThoughtSignatureCache(...)` with `AntigravityCache(cache_file=GEMINI3_SIGNATURE_CACHE_FILE|CLAUDE_THINKING_CACHE_FILE, memory_ttl_seconds=..., disk_ttl_seconds=...)`.
- New cache constants `GEMINI3_SIGNATURE_CACHE_FILE` and `CLAUDE_THINKING_CACHE_FILE` were added; ensure integrations use the new names if relying on disk cache paths.

… tier-based onboarding This commit refactors the project discovery logic to strictly follow the official Gemini CLI behavior, fixing critical issues with paid tier support and free tier onboarding. Key changes: - Implement proper discovery flow: cache → configured override → persisted credentials → loadCodeAssist check → tier-based onboarding → fallback - Fix paid tier support: paid tiers now correctly use configured project_id instead of server-managed projects - Fix free tier onboarding: free tier correctly passes cloudaicompanionProject=None for server-managed projects - Add comprehensive tier detection logic: check currentTier from server response and respect userDefinedCloudaicompanionProject flag - Improve error handling: add specific error messages for 412 (precondition failed) and better guidance for missing project_id on paid tiers - Add detailed debug logging: log all tier information, server responses, and decision flow for troubleshooting - Add paid tier visibility: log paid tier usage on each request for transparency - Remove noisy debug logging: disable verbose chunk conversion logs The previous implementation incorrectly assumed all users should use server-managed projects and failed to properly distinguish between free tier (server-managed) and paid tier (user-provided) project handling. This caused 403/412 errors for paid users and incorrect onboarding flow for free users.

… organization and documentation This is a major refactoring of the Antigravity provider implementation that significantly improves code structure, readability, and maintainability without changing functionality. Key improvements: - Reorganized code into logical sections with clear separators (configuration, utilities, caching, transformations, API interface) - Consolidated helper functions with consistent naming patterns (underscore prefix for internal methods) - Simplified complex methods by extracting reusable components (e.g., _parse_content_parts, _extract_tool_call, _format_type_hint) - Enhanced documentation with comprehensive module docstring explaining features and capabilities - Streamlined environment variable handling with dedicated helper functions (_env_bool, _env_int) - Improved type hints and method signatures for better IDE support - Reduced code duplication in message transformation logic - Consolidated tool schema transformations into focused methods - Better separation of concerns between streaming and non-streaming response handling - Standardized error handling and logging patterns - Improved cache implementation with clearer separation of responsibilities The refactoring maintains full backward compatibility while making the codebase significantly easier to understand, test, and extend. All existing features including Gemini 3 thoughtSignature preservation, Claude thinking caching, tool hallucination prevention, and base URL fallback remain fully functional.

…module Extracted the AntigravityCache class into a new shared ProviderCache module to eliminate code duplication and improve maintainability across providers. - Created src/rotator_library/providers/provider_cache.py with generic, reusable cache implementation - Removed 266 lines of cache-specific code from antigravity_provider.py - Updated AntigravityProvider to use ProviderCache for both signature and thinking caches - Added configurable env_prefix parameter for flexible environment variable namespacing - Improved cache naming with _cache_name for better logging context - Added convenience factory function create_provider_cache() for streamlined cache creation - Removed unused imports (shutil, tempfile) from antigravity_provider.py - Updated .gitignore to include cache/ directory The new ProviderCache maintains full backward compatibility with the previous AntigravityCache implementation while providing a more modular, reusable foundation for other providers.

…automatic -thinking mapping This commit streamlines the handling of Claude Sonnet 4.5 model variants by automatically mapping the base model to its -thinking variant when reasoning_effort is provided. - Remove explicit "claude-sonnet-4-5-thinking" from AVAILABLE_MODELS list - Add inline documentation explaining internal mapping behavior - Implement automatic model variant selection in _transform_to_antigravity_format based on reasoning_effort parameter - Thread reasoning_effort parameter through generate_content call chain - Check for base claude-sonnet-4-5 model and append "-thinking" suffix when reasoning_effort is present This improves the API surface by reducing redundant model options while maintaining full functionality through intelligent runtime model selection.
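
The variant selection amounts to a small rule, sketched below. The function name `resolve_claude_model` is hypothetical. In the provider this check is described as living inside `_transform_to_antigravity_format`.

```python
from typing import Optional

CLAUDE_BASE_MODEL = "claude-sonnet-4-5"


def resolve_claude_model(model: str, reasoning_effort: Optional[str]) -> str:
    """Append "-thinking" to the base Claude model when reasoning is requested."""
    if model == CLAUDE_BASE_MODEL and reasoning_effort:
        return f"{model}-thinking"
    return model  # other models, or no reasoning_effort: unchanged


with_reasoning = resolve_claude_model("claude-sonnet-4-5", "high")
without_reasoning = resolve_claude_model("claude-sonnet-4-5", None)
```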

…ure caching This commit integrates comprehensive support for `gemini-3-pro-preview`, addressing specific requirements for reasoning models and tool reliability. - Update `AntigravityProvider` and `GeminiCliProvider` model lists to prioritize Gemini 3. - Implement a "Tool Fix" mechanism to prevent parameter hallucinations: - Inject strict parameter signatures and type hints into tool descriptions. - Add specific system instructions to enforce schema adherence. - Apply `gemini3_` namespace prefixing to isolate tool contexts. - Integrate `ProviderCache` to persist `thoughtSignature` values, ensuring reasoning continuity during tool execution. - Refactor `_handle_reasoning_parameters` to support Gemini 3's `thinkingLevel` (string) alongside Gemini 2.5's `thinkingBudget` (integer). - Add environment variable configuration for cache TTL and feature flags.

…quest payload The `model` and `project` parameters were being incorrectly included at the top level of the request payload. These fields are not part of the Gemini API request body structure and should only be used for endpoint construction or authentication context.

…g for Antigravity

- Change reasoning parameters log from info to debug level in main.py
- Move reasoning parameters logging outside logger conditional block for consistent monitoring
- Enhance _clean_claude_schema documentation to clarify it's for Antigravity/Google's Proto-based API
- Add support for converting 'const' to 'enum' with single value in schema cleaning
- Improve code organization with better comments explaining unsupported fields

These changes improve logging granularity and enhance JSON Schema compatibility with Antigravity's Proto-based API requirements.
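As an illustration of the 'const' to 'enum' conversion, here is a minimal sketch. The function name `const_to_enum` is hypothetical and the real `_clean_claude_schema` handles more fields than this; the sketch only shows the single rewrite named above.

```python
from typing import Any


def const_to_enum(schema: Any) -> Any:
    """Hypothetical sketch: rewrite {"const": X} as {"enum": [X]} recursively."""
    if isinstance(schema, list):
        return [const_to_enum(item) for item in schema]
    if not isinstance(schema, dict):
        return schema

    cleaned = {}
    for key, value in schema.items():
        if key == "const":
            # A single-value enum carries the same constraint as const.
            cleaned["enum"] = [value]
        elif key == "properties" and isinstance(value, dict):
            # Keys under "properties" are user-chosen field names, not schema
            # keywords, so a field literally named "const" must be preserved.
            cleaned[key] = {name: const_to_enum(sub) for name, sub in value.items()}
        else:
            cleaned[key] = const_to_enum(value)
    return cleaned
```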

…model switches

This commit introduces intelligent handling of Claude's thinking mode when switching models mid-conversation during incomplete tool use loops.

**New Features:**

- Auto-detection of incomplete tool turns (when messages end with tool results without assistant completion)
- Configurable turn completion injection via `ANTIGRAVITY_AUTO_INJECT_TURN_COMPLETION` (default: true)
- Configurable thinking mode suppression via `ANTIGRAVITY_AUTO_SUPPRESS_THINKING` (default: false)
- Customizable turn completion placeholder text via `ANTIGRAVITY_TURN_COMPLETION_TEXT` (default: "...")

**Implementation Details:**

- `_detect_incomplete_tool_turn()`: Analyzes message history to identify incomplete tool use patterns
- `_inject_turn_completion()`: Appends a synthetic assistant message to close incomplete turns
- `_handle_thinking_mode_toggle()`: Orchestrates the toggling strategy based on configuration

**Behavior:**

When switching to Claude with thinking mode enabled during an incomplete tool loop:

1. If auto-injection is enabled: Inject a completion message to allow thinking mode
2. If auto-suppression is enabled: Disable thinking mode to prevent API errors
3. If both disabled: Allow the request to proceed (likely resulting in API error)

This resolves API compatibility issues when transitioning between models with different conversation state requirements.

The generic key handling logic was incorrectly concatenating the 'role' field when processing streaming message chunks. The role field should always be replaced with the latest value, not concatenated like content fields. This fix adds an explicit check to ensure the 'role' key is always overwritten rather than appended to, preventing malformed role values in the final message object.
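A minimal sketch of the fix, assuming chunk deltas arrive as flat dictionaries. The helper name `merge_delta` is hypothetical; the actual aggregation code is not shown in this PR description.

```python
from typing import Any, Dict


def merge_delta(message: Dict[str, Any], delta: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: fold one streaming delta into the final message."""
    for key, value in delta.items():
        if value is None:
            continue
        if key == "role":
            # The role is replaced, never appended; otherwise repeated chunks
            # would produce values such as "assistantassistant".
            message["role"] = value
        elif isinstance(value, str):
            # Text fields such as content accumulate across chunks.
            message[key] = message.get(key, "") + value
        else:
            message[key] = value
    return message
```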

Antigravity sometimes returns malformed JSON strings with extra trailing characters (e.g., '[{...}]}' instead of '[{...}]'). This enhancement extends the JSON parsing logic to automatically detect and correct such malformations by:

- Detecting JSON-like strings that don't have proper closing delimiters
- Finding the last valid closing bracket/brace and truncating extra characters
- Logging warnings when auto-correction is applied for debugging purposes
- Recursively parsing the corrected JSON structures

This prevents parsing failures when Antigravity returns double-encoded or malformed JSON in tool arguments.
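The truncation step can be sketched roughly as below. The function name `parse_lenient_json` is hypothetical, and the sketch omits the recursive re-parsing of nested, double-encoded strings mentioned above; it only shows how trailing characters might be cut at the last closing delimiter that yields valid JSON.

```python
import json
import logging
from typing import Any

logger = logging.getLogger(__name__)


def parse_lenient_json(text: str) -> Any:
    """Hypothetical sketch: parse JSON, dropping stray trailing characters."""
    stripped = text.strip()
    try:
        return json.loads(stripped)
    except json.JSONDecodeError:
        pass

    if not stripped or stripped[0] not in "[{":
        raise ValueError("input does not look like a JSON array or object")

    # The closing delimiter must match the opening one.
    closer = "]" if stripped[0] == "[" else "}"
    end = stripped.rfind(closer)
    while end > 0:
        candidate = stripped[: end + 1]
        try:
            result = json.loads(candidate)
        except json.JSONDecodeError:
            # Try the previous occurrence of the closing delimiter.
            end = stripped.rfind(closer, 0, end)
            continue
        logger.warning("Auto-corrected malformed JSON: dropped %r", stripped[end + 1:])
        return result
    raise ValueError("could not recover a valid JSON value")
```

For the example in the commit message, `parse_lenient_json('[{"a": 1}]}')` truncates the final brace and returns the list.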

…dentials

The `_get_provider_instance` method now checks if credentials exist for a provider before attempting initialization. This prevents potential errors from initializing providers that lack proper configuration.

- Added credential existence check at the start of the method
- Returns `None` early if provider credentials are not configured
- Added debug logging to indicate when provider initialization is skipped
- Enhanced docstring with detailed Args and Returns documentation

This change improves system robustness by failing gracefully when providers are referenced but not properly configured.
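The early-return pattern might look like the sketch below. The `ProviderRegistry` class, its constructor arguments, and the factory mapping are assumptions made for illustration; only the behavior (check credentials first, log at debug level, return `None`) comes from the commit message.

```python
import logging
from typing import Callable, Dict, List, Optional

logger = logging.getLogger(__name__)


class ProviderRegistry:
    """Hypothetical container that lazily creates provider instances."""

    def __init__(
        self,
        credentials: Dict[str, List[str]],
        factories: Dict[str, Callable[[], object]],
    ) -> None:
        self._credentials = credentials
        self._factories = factories
        self._instances: Dict[str, object] = {}

    def get_provider_instance(self, name: str) -> Optional[object]:
        """Return the provider instance, or None if it has no credentials."""
        if not self._credentials.get(name):
            logger.debug("Skipping provider %r: no credentials configured", name)
            return None
        if name not in self._instances:
            self._instances[name] = self._factories[name]()
        return self._instances[name]
```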

This commit removes the thinking mode toggling functionality that was previously used to handle model switches mid-conversation when tool use loops were incomplete.

- Removed `_detect_incomplete_tool_turn`, `_inject_turn_completion`, and `_handle_thinking_mode_toggle` helper methods
- Removed environment variable configuration for turn completion behavior (`ANTIGRAVITY_AUTO_INJECT_TURN_COMPLETION`, `ANTIGRAVITY_AUTO_SUPPRESS_THINKING`, `ANTIGRAVITY_TURN_COMPLETION_TEXT`)
- Removed thinking mode toggle logic from `acompletion` method
- Added provider prefix to JSON auto-correction warning log for better debugging

The removed feature was designed to automatically handle incomplete tool use loops when switching to Claude models with thinking mode enabled, but was buggy as hell.

… failures

This commit improves the robustness of OAuth token refresh operations in both IFlowAuthBase and QwenAuthBase by implementing failure tracking with exponential backoff and credential validation.

- Track refresh failures per credential path using `_refresh_failures` dictionary
- Implement exponential backoff (30s * 2^failures, max 5 minutes) to prevent rapid retry loops on persistent failures
- Clear backoff state on successful authentication or refresh
- Add validation to ensure refreshed credentials contain required fields (access_token, refresh_token, and api_key for iFlow)
- Update proactively_refresh to support env:// virtual paths for environment-based OAuth credentials
- Add detailed debug logging for backoff timer settings

The backoff mechanism prevents excessive API calls when refresh tokens are invalid or services are temporarily unavailable, while the validation ensures credential integrity after refresh operations.
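The backoff formula (30s * 2^failures, capped at 5 minutes) can be sketched as below. The `RefreshBackoff` class and its method names are hypothetical, and the sketch assumes the exponent counts failures recorded before the current one, so the first delay is 30 seconds; the commit message does not say which convention the auth classes use.

```python
import time
from typing import Callable, Dict

BASE_DELAY_SECONDS = 30
MAX_DELAY_SECONDS = 300  # 5 minutes


def backoff_delay(prior_failures: int) -> int:
    """Delay in seconds: 30 * 2^prior_failures, capped at 300."""
    return min(BASE_DELAY_SECONDS * 2 ** prior_failures, MAX_DELAY_SECONDS)


class RefreshBackoff:
    """Hypothetical per-credential failure tracker."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._refresh_failures: Dict[str, int] = {}
        self._next_attempt: Dict[str, float] = {}

    def can_attempt(self, path: str) -> bool:
        """True once the backoff window for this credential has elapsed."""
        return self._clock() >= self._next_attempt.get(path, float("-inf"))

    def record_failure(self, path: str) -> int:
        """Register a failed refresh and return the delay that now applies."""
        prior = self._refresh_failures.get(path, 0)
        delay = backoff_delay(prior)
        self._refresh_failures[path] = prior + 1
        self._next_attempt[path] = self._clock() + delay
        return delay

    def record_success(self, path: str) -> None:
        """Clear all backoff state after a successful refresh."""
        self._refresh_failures.pop(path, None)
        self._next_attempt.pop(path, None)
```

Under this convention the delays run 30, 60, 120, 240 and then stay at 300 seconds until a success resets the counter.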

…alization in stream reassembly

This commit addresses critical issues in the streaming response reassembly logic across multiple providers (Gemini CLI, iFlow, and Qwen Code):

- Implements priority-based finish_reason determination: tool_calls > chunk's finish_reason (length, content_filter, etc.) > stop
- Properly initializes aggregated_tool_calls with "type": "function" field for OpenAI compatibility
- Tracks chunk_finish_reason separately to preserve provider-specific finish reasons (e.g., content_filter, length limits)
- Uses safer .get("index", 0) for tool call index extraction to prevent KeyErrors
- Adds explicit type field handling during tool call aggregation
- Improves docstring documentation explaining the reassembly logic
- Moves copy import to top-level in iflow_provider.py and qwen_code_provider.py for consistency

CRITICAL FIX for qwen_code_provider.py: Handles chunks with BOTH usage and choices data (typical for final chunk) without early return, ensuring finish_reason is properly captured before yielding usage data separately.
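A rough sketch of the reassembly rules listed above. Both function names are hypothetical and the delta shape is assumed to follow the OpenAI streaming format; the sketch shows the finish_reason priority and the default "type": "function" initialization, not the providers' full logic.

```python
from typing import Any, Dict, Iterable, List, Optional


def aggregate_tool_calls(deltas: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical helper: merge streamed tool call fragments by index."""
    aggregated: Dict[int, Dict[str, Any]] = {}
    for delta in deltas:
        index = delta.get("index", 0)  # tolerate providers that omit the index
        entry = aggregated.setdefault(
            index,
            {"id": None, "type": "function", "function": {"name": "", "arguments": ""}},
        )
        if delta.get("id"):
            entry["id"] = delta["id"]
        if delta.get("type"):
            entry["type"] = delta["type"]
        function = delta.get("function") or {}
        if function.get("name"):
            entry["function"]["name"] = function["name"]
        if function.get("arguments"):
            # Arguments arrive as string fragments and are concatenated.
            entry["function"]["arguments"] += function["arguments"]
    return [aggregated[i] for i in sorted(aggregated)]


def resolve_finish_reason(has_tool_calls: bool, chunk_finish_reason: Optional[str]) -> str:
    """Priority: tool_calls > the chunk's own finish_reason > stop."""
    if has_tool_calls:
        return "tool_calls"
    if chunk_finish_reason:
        return chunk_finish_reason
    return "stop"
```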

The .env file was being loaded after attempting to read PROXY_API_KEY from environment variables, causing the key to be unavailable for display during startup. Moving the dotenv.load_dotenv() call earlier in the initialization sequence ensures environment variables are loaded before they are accessed.

Introduces a comprehensive provider-specific settings management system for Antigravity and Gemini CLI providers with detection, display, and interactive configuration capabilities.

- Add `PROVIDER_SETTINGS_MAP` with detailed definitions for Antigravity (12 settings) and Gemini CLI (8 settings) including signature caching, tool fixes, and provider-specific parameters
- Implement `ProviderSettingsManager` class for managing provider settings with type-aware value parsing and modification tracking
- Add `detect_provider_settings()` method to `SettingsDetector` to identify modified provider settings from environment variables
- Integrate provider settings detection into launcher TUI summary display and detailed advanced settings view
- Add new menu option (4) in settings tool for provider-specific configuration management
- Implement interactive TUI for browsing, editing, and resetting individual or all provider settings with visual indication of modified values
- Display provider settings status in launcher with count of modified settings per provider
- Support bool, int, and string setting types with appropriate input handling and validation
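To illustrate what type-aware parsing of bool, int, and string settings could look like, here is a small sketch. The function name `parse_setting`, the type labels, and the accepted boolean spellings are all assumptions; the PR does not document how `ProviderSettingsManager` actually parses values.

```python
from typing import Union

# Assumed boolean spellings; the real tool may accept a different set.
_TRUE_VALUES = {"true", "1", "yes", "on"}
_FALSE_VALUES = {"false", "0", "no", "off"}


def parse_setting(raw: str, kind: str) -> Union[bool, int, str]:
    """Hypothetical parser: convert an environment string to its declared type."""
    if kind == "bool":
        lowered = raw.strip().lower()
        if lowered in _TRUE_VALUES:
            return True
        if lowered in _FALSE_VALUES:
            return False
        raise ValueError(f"not a boolean value: {raw!r}")
    if kind == "int":
        return int(raw.strip())
    if kind == "str":
        return raw
    raise ValueError(f"unknown setting type: {kind!r}")
```

Raising on unrecognized input, rather than silently falling back to a default, lets an interactive editor reject the value and prompt again.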

…ere anyway)

Restructured the Antigravity provider description in the README for better clarity and readability:

- Converted the dense paragraph into a structured bullet list highlighting key features
- Separated thought signature caching, tool hallucination prevention, and thinking block sanitization into distinct points
- Replaced the informal troubleshooting note with a concise reference to dedicated documentation
- Added direct link to Antigravity documentation section for Claude extended thinking sanitization details

This change improves the discoverability of Antigravity's advanced features and provides a clearer path for users to understand Claude Sonnet 4.5 thinking mode limitations.

ellipsis-dev

Important

Looks good to me! 👍

Reviewed efbd008 in 1 minute and 36 seconds.

Reviewed 17 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 draft comments. View those below.

1. README.md:31

Draft comment:
Refined the Antigravity Provider description – the bullet list clearly outlines advanced features and removes informal language. Verify that the linked documentation covers all details on Claude Sonnet 4.5 state management.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 1% None

Workflow ID: wflow_WenuYCxlyW35jJYX


…limit

- Add `*.env` to `.gitignore` to prevent accidentally committing environment variables containing sensitive data
- Increase `DEFAULT_MAX_OUTPUT_TOKENS` from 16384 to 32384 in Antigravity provider to allow for longer model outputs

…ding and bulk export tools

This commit introduces comprehensive support for loading OAuth credentials from environment variables alongside file-based credentials, and adds powerful bulk export/combine functionality for all credential types.

Main changes:

- **Environment-based credentials**: Modified main.py to load all *.env files from the root directory, enabling credentials to be stored in environment variables with an "env://" virtual path scheme
- **Safe metadata handling**: Added checks throughout to skip file I/O operations for env-based credentials (they use virtual paths and don't have metadata files)
- **Optimized credential discovery**: Updated RotatingClient to accept pre-discovered credentials from main.py, avoiding redundant discovery calls
- **Bulk export tools**: Added `export_all_provider_credentials()` to export all credentials for a specific provider to individual .env files
- **Credential combining**: Added `combine_provider_credentials()` to merge all credentials for a provider into a single .env file, and `combine_all_credentials()` to create one master .env file with all providers
- **Enhanced export menu**: Expanded the credential export submenu with 13 options covering individual exports, bulk exports per provider, and various combining strategies
- **Provider support**: Added helper functions `_build_gemini_cli_env_lines()`, `_build_qwen_code_env_lines()`, `_build_iflow_env_lines()`, and `_build_antigravity_env_lines()` for consistent .env file generation

These changes enable flexible credential management, allowing users to store credentials as files or environment variables, and providing powerful tools to export and combine credentials for deployment scenarios.

ellipsis-dev

Caution

Changes requested ❌

Reviewed bd8f638 in 2 minutes and 56 seconds.

Reviewed 473 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 7 draft comments. View those below.

1. src/rotator_library/client.py:66

Draft comment:
Rotation tolerance parameter set to 3.0 appears appropriate. Ensure that documentation (e.g. README) clearly explains recommended ranges for production use.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

2. src/rotator_library/client.py:752

Draft comment:
Credential prioritization logic using get_model_tier_requirement and get_credential_priority is well integrated. Consider clarifying (in comments or docs) the behavior when a credential’s priority is unknown (None).
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

3. src/rotator_library/client.py:1195

Draft comment:
The streaming completion retry logic similarly applies model tier filtering. Consider refactoring common filtering logic between streaming and non‐streaming paths to reduce code duplication.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

4. src/rotator_library/credential_tool.py:50

Draft comment:
The helper _build_env_export_content is well structured for generating .env lines. It may help to sanitize values (e.g. email addresses) to avoid issues with special characters.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

5. src/rotator_library/credential_tool.py:110

Draft comment:
In ensure_env_defaults, a default PROXY_API_KEY ('VerysecretKey') is set. Ensure that users are clearly warned that this default is insecure for production environments.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

6. src/rotator_library/credential_tool.py:130

Draft comment:
The hardcoded provider list in setup_api_key is extensive. Consider externalizing these settings (e.g. in a config file) for easier updates and maintenance.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

7. src/rotator_library/credential_tool.py:792

Draft comment:
The export and combine credential functions offer flexible export options. Consider adding additional error handling around file I/O operations to catch and report permission or disk errors.
Reason this comment was not posted:
Comment looked like it was already resolved.

Workflow ID: wflow_Pm3cUd40zCdw7kR3


ellipsis-dev · 2025-11-27T17:31:46Z

src/proxy_app/main.py

-                except Exception as e:
-                    logging.error(f"Failed to update metadata for '{path}': {e}")
+                # Update metadata (skip for env-based credentials - they don't have files)
+                if not path.startswith("env://"):


Consider defining a constant for 'env://' to improve maintainability when skipping metadata update for env-based credentials.

Introduces a new model information service that fetches pricing and capability data from external catalogs (OpenRouter and Models.dev) to enrich the /v1/models endpoint and enable cost estimation.

- Implements ModelRegistry class with async background data fetching to avoid blocking proxy startup
- Adds fuzzy model ID matching with multi-source data aggregation
- Expands /v1/models endpoint with optional enriched response containing pricing, token limits, and capability flags
- Adds new endpoints: GET /v1/models/{model_id}, GET /v1/model-info/stats, POST /v1/cost-estimate
- Supports per-token pricing for input, output, cache read, and cache write operations
- Integrates with lifespan management for proper service initialization and cleanup
- Includes comprehensive backward compatibility layer for gradual migration

The service refreshes data every 6 hours (configurable via MODEL_INFO_REFRESH_INTERVAL) and runs asynchronously to maintain fast proxy initialization times.
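The per-token cost arithmetic behind an endpoint like POST /v1/cost-estimate can be sketched as follows. The `Pricing` dataclass, its field names, and `estimate_cost` are hypothetical and do not reflect the service's actual request or response schema; the sketch only shows the four price components named above being summed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-token prices in USD."""

    input: float
    output: float
    cache_read: float = 0.0
    cache_write: float = 0.0


def estimate_cost(
    pricing: Pricing,
    input_tokens: int,
    output_tokens: int,
    cache_read_tokens: int = 0,
    cache_write_tokens: int = 0,
) -> float:
    """Sum each token count multiplied by its per-token price."""
    return (
        input_tokens * pricing.input
        + output_tokens * pricing.output
        + cache_read_tokens * pricing.cache_read
        + cache_write_tokens * pricing.cache_write
    )
```

Catalogs often quote prices per million tokens, so a real implementation would need to normalize units before applying a formula like this.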

ellipsis-dev · 2025-11-27T18:27:00Z